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ABSTRACT 

Applying the standard weighted mean formula, "-i'^i"^ / Z]j '^^'^ i to determine 
the weighted mean of data, nj, drawn from a Poisson distribution, will, on average, 
underestimate the true mean by ~1 for all true mean values larger than ~3 when the 
common assumption is made that the error of the ith observation is = max(y^, 1). 
This small, but statistically significant offset, explains the long-known observation 
that chi-square minimization techniques which use the modified Neyman's statistic, 

= X^iC^-j ~ Vi)'^ / max(ni,l), to compare Poisson-distributed data with model 
values, yt, will typically predict a total number of counts that underestimates the 
true total by about 1 count per bin. Based on my finding that the weighted mean 
of data drawn from a Poisson distribution can be determined using the formula 

[rii + min (nj, 1)] (n^ + l)'"*^ / Yl,i ("-i + 1)^^ ) I propose that a new statistic, 

= X^i V^i + 1) ~ Ui]^ I \p-i + l]i should always be used to analyze Poisson- 

distributed data in preference to the modified Neyman's x^ statistic. I demonstrate the 
power and usefulness of x^ minimization by using two statistical fitting techniques and 
five x^ statistics to analyze simulated X-ray power-law 15-channel spectra with large 
and small counts per bin. I show that x^ minimization with the Levenberg-Marquardt 
or Powell's method can produce excellent results (mean slope errors ^3%) with spectra 
having as few as 25 total counts. 
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1. INTRODUCTION 

The determination of the weighted mean is the fundamental problem for chi-square (x^) 
minimization methods. The goodness-of-fit between an observation of A'^ data values, Xi, with 
errors, cij, and a model, m,, can be determined by using the standard chi-square statistic: 
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The theory of least-squares states that the optimum value of all the parameters of the model are 
obtained when the chi-square statistic is minimized with respect to each parameter simultaneously. 
For example, the standard formula of the weighted mean can be derived by assuming that the 
model is a constant and then solving the equation. 
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The standard weighted- mean formula thus weights every data value, Xi, inversely by its own 
variance (i.e. erf). 

Let us assume that all the data values come from a pure counting experiment where each 
data value, n,, is a random integer deviate drawn from a Poisson ( 1837 ) distribution. 
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with a mean value of /x. Let us also make the common assumption that the error of each data value 
is the square root of the mean of the parent Poisson distribution. Using these transformations, 
Xi ^ Hi and (Ti ^ , we see that Equation (^) becomes 
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which reduces to become the definition of the sample mean: 
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In the limit of a large number of observations of the Poisson distribution P{k; /x), we find that 
Equation ^ will, on average, determine the mean of the parent Poisson distribution for all true 
mean values /j,: 
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to determine the 



Applying the standard weighted mean formula, 
weighted mean of data, ui, drawn from a Poisson distribution, will, on average, determine the 
mean of the parent Poisson distribution for all true mean values if a constant weight is assigned 
to all data values (i.e. o"^^ = constant). 



It is a common practice to assume that the error of a Poisson deviate n is cr = ^/n. 
Unfortunately, this practice causes the standard weighted-mean formula to be undefined for data 
values of zero. A simple solution to this computational problem is to arbitrarily assign a non-zero 
constant error to all Poisson deviates with a value of zero. Let us make the common assumption 
that the error of each data value, n^, is equal to or 1 — whichever is greater. Using the 
following transformations, Xi Ui and cJi ^ max(y^, 1) , we see that Equation (^) becomes 
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In the limit of a large number of observations of the Poisson distribution P{k; jj), we find that 
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where Ei(x) is the exponential integral of x, Ei(x) = — ^ (it = ^ dt (for x > 0), and 7 
is the Euler-Mascheroni constant: 7 = lim„^oo |Z]r=i ~ ln(n) = 0.5772156649- • • (see, e.g., 
Abramowitz & Stegun 1964). 



Let us now investigate the limit of Equation (^) with large Poisson mean values. The 
transformation of Equation (^ to Equation ([To| ) used the power series of Ei(x), 
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which has the following asymptotic expansion: 
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From the following limit, 
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we see that Ei(x) asymptotically approaches the function e^/{x — 1) for large values of x. For 
> 13 this approximation has an error of <1% ; for x > 33 the error is <0.1% . In the limit of 
large mean Poisson values, we see that the numerator of Equation ( p!o|) is dominated by the e'^ 
term while the denominator is dominated by the Ei(//) term which asymptotically approaches the 
value of e^/(/i — 1). We then have come to the surprising conclusion that for Poisson distributions 
with large mean values, limTv^oo [/^n] approaches the value o//i — 1 instead of the expected value of 

Equation (^) can also be investigated graphically. Figure [l|a plots the difference between 
the weighted mean computed using Equation and the true mean for Poisson-distributed data 
with true mean values between 0.001 and 1000. Each open square represents the weighted mean 
of 4x10^ Poisson deviates at each given true mean value. The solid curve through the data [open 
squares in Fig. ||a] is the difference between Equation (10) and the true mean. Note that Equation 
(|lO|) underestimates the true mean by ~1 for large true mean values (as predicted above). 
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Applying the standard weighted mean formula, J2i ^i'^l 
weighted mean of data, ui, drawn from a Poisson distribution, will, on average, underestimate the 
true mean by ^1 for all true mean values larger than ~3 when the common assumption is made 
that the error of the ith observation is ai = max(y^, 1). 



2. THE WEIGHTED MEAN OF POISSON-DISTRIBUTED DATA 



We will now develop a weighted-mean formula for Poisson-distributed data that will, on 
average, determine the true mean of the parent distribution for all true mean values. 



Let us assume that the error of each data value, rij, is equal to ^Jui + 1 instead of max(y^, 1) . 
Using the following transformations, Xi ^ Ui and cjj ^ \/ni + 1 , we see that Equation (pf) becomes 
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In the limit of a large number of observations of the Poisson distribution P{k\ jj), we find that 
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Figure |l|b graphically confirms this finding. Increasing the error estimates from max(y^, 1) to 
\/ni + 1 has only yielded a minor improvement. Notice that the dip in the solid curve in Fig. ||a 
at ~ 6 is not present in the solid curve in Fig. Ufa. A more radical change appears to be required 
in order for us to develop a weighted-mean formula for Poisson-distributed data. 

Let us now add one to all data values and assume that the error of each data value is the 
square root of the new data value. Using these transformations, Xi ^ rii + 1 and cij ^ \/ni + 1 , 
we see that Equation (S) becomes 
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In the limit of a large number of observations of the Poisson distribution P{k] /x), we find that 
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Figure ||c graphically confirms this finding. We have now made significant progress towards our 
goal of developing a weighted-mean formula for Poisson-distributed data. Applying Equation (16) 
to determine the weighted mean of Poisson-distributed data, will, on average, estimate the true 
mean with ^1% errors for true Poisson mean values ^ ^ 5. 

The deviation of the solid curve in Figure |^c from zero can be eliminated by making just a 
minor change to our transformations. Using the same errors as above, o"j y/rii + 1 , but now 
adding one to only those data values that are initially greater than zero, Ui ^ rii + min(nj, 1) , we 
see that Equation (^) becomes 
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In the limit of a large number of observations of the Poisson distribution P{k; /u), we find that 
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Figure |l]d graphically confirms this finding. We have now achieved our goal of developing a 
weighted-mean formula for Poisson-distributed data. Applying Equation ^1^ ) to determine the 
weighted mean of Poisson-distributed data, will, on average, estimate the true mean for all true 
Poisson mean values (/U > 0). 



3. THE xl STATISTIC 



Based on my finding that the weighted mean of data drawn from a Poisson distribution can 
be determined using the formula Yl,i [^i + 1)] + 1)"^ / Z^i {'^i + > I propose 

that, given N observations (nj) and a model {mi), a new statistic, 
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should always be used to analyze Poisson-distributed data in preference to the modified Neyman's 
statistic, 
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7^ max(nj, 1) 

because the weighted-mean formula for the modified Neyman's x^ statistic [//n: Equation (|8|) ] 
systematically underestimates the true mean value of Poisson-distributed data with true mean 
values ^ ^ 0.5 (see Fig. ||a). 

For Poisson-distributed data, it has long been observed that, in many cases, chi-square fits 
using the modified Neyman's x^ statistic and the Pearson's x^ statistic. 
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will underestimate and overestimate the total area, respectively, while the usage of the maximum 
likelihood ratio statistic for Poisson distributions, 
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preserves the total area (e.g., Baker & Cousins 1984 and references therein). 



It has been known for decades that chi-square minimization techniques using the modified 
Neyman's statistic to analyze Poisson-distributed data will typically predict a total number 
of counts (total area) that underestimates the true total counts by about 1 count per bin (e.g., 
Bevington p^69| , Wheaton et al. f99|). The reason why this underestimation occurs is now 
obvious: the application of the modified Neyman's x^ statistic to Poisson-distributed data causes 
the fitted model value at each bin, rrii, to be, on average, underestimated by ~1 count for all 
true Poisson model mean values ^3 . The underestimation of the true mean by one count gives 
a very large 20% error when the true mean of the data is 5 but only a 1% error when the true 
mean of the data is 100. It would clearly be difficult to detect such a small systematic error with 
small samples of Poisson-distributed data with large true mean values. Figure ||a shows that this 
underestimation is real and is easily measurable with large samples of Poisson-distributed data. 

The number of degrees of freedom, commonly represented with the symbol z^, of a chi-square 
minimization problem is the difference between the number of observations (sample size) and the 
number of free parameters (M) of the model: v = N — M. 

The reduced chi-square of the Pearson's x^ statistic is, by definition, the value of Pearson's 
X^ statistic divided by the number of degrees of freedom: 
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On average, the expected reduced chi-square value of a proper x^ statistic with a perfect model 
is one — given a large number of observations. Now let us assume that our data comes from a 
Poisson distribution with a mean value of In this case, the model rrii will be a constant, 
[Equation (^)], which will, on average, have a value, /xp/, given by Equation (0) in the limit of 
a large number of observations (N.B. np/ = //). The model is a constant and therefore there is 
only one degree-of-freedom: M = 1. Given these assumptions, we find that, in the limit of a large 
number of observations, the reduced chi-square of the Pearson's x^ statistic with the model fip is 
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The reduced chi-square of the modified Neyman's statistic is, by definition, the value of 
the modified Neyman's statistic divided by the number of degrees of freedom: 
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Now let us assume that our data comes from a Poisson distribution with a mean value of /x. In this 
case, the model will be a constant, /Un [Equation (^)], which will, on average, have a value, //n'; 
given by Equation (^) in the limit of a large number of observations. Given these assumptions, 
we find that, in the limit of a large number of observations, the reduced chi-square of the modified 
Neyman's statistic with the model is 
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In the limit of large mean Poisson values, we see that the numerator of the first term of Equation 
( pT] ) is dominated by the — e'^ term while the denominator of the first term is dominated by the 
Ei(^) term which asymptotically approaches the value of e^/(/i — 1). We then conclude that the 
reduced chi-square of the Xn statistic applied to a Poisson distribution [Equation (p7|)] approaches 
the value of one for large true Poisson mean values. Figure |2| graphically confirms this finding; 
we see that Equation (27) reaches a value of ~1 only for very large true Poisson mean values 

(m^ioo). 

The reduced chi-square of the new statistic is, by definition, the value of the X7 statistic 
divided by the number of degrees of freedom: 
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Now let us assume that our data comes from a Poisson distribution with a mean value of In 
this case, the model rui will be a constant, /i^ [Equation (18)], which will, on average, have a 
value, /i^/, given by Equation (^) in the limit of a large number of observations (N.B. /i^/ = /i). 
Given these assumptions, we find that, in the limit of a large number of observations, the reduced 
chi-square of the new statistic with the model //-y is 
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Figure ^ shows that the reduced chi-square of the statistic applied to a Poisson distribution 
[Equation ([29|)] approaches the value of one for small true Poisson mean values (i.e. ^ ^7). 
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Figure |3| shows the variance of the reduced chi-square of the Xpj Xn' X7) and X\ statistics as 
a function of the true Poisson mean. This figure was derived by analyzing the data used in Figure 



4. SIMULATED X-RAY POWER-LAW SPECTRA 

I now demonstrate the new statistic by using it to study a dataset of simulated X-ray 
power-law spectra. This dataset is based on my duplication of the simple numerical experiment 
of Nousek &: Shue ( |198S| ). The number of X-ray photons per energy interval (bin) of a X-ray 
power-law spectrum is 

dN = NqE-^ dE . (30) 

Over an energy range, -Emin < E < .Emax keV, the expectation value for the total number of counts 
can be determined as follows 

N = No E-^ dE , (31) 



which implies that 

N 
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Following Nousek & Shue, I chose the slope value of 7 = 2.0 and used the energy range of 
0.095-0.845 keV which was split into 15 equal bins of 0.050 keV per bin. I simulated 10^ X-ray 
spectra for each of the theoretical values used by Nousek & Shue: 25, 50, 75, 100, 150, 250, 500, 
750, 1000, 2500, 5000, and 10"^ photons per spectrum. Figure |^ shows four of the simulated X-ray 
power-law spectra. 



4.1. Powell's Method: Solving for 7 and N using Xp? X7 

I determined the best-fit model parameters 7caic and N^aic for each simulated spectrum with 
Powell's function minimization method^ using the modified Neyman's statistic (xn)> Pearson's 
statistic (xp), and the new statistic. I used the following crude initial guesses: 7 = 0.0 and 
N = 1.3 rii, where rii is the observed number of photons in the ith channel (bin). I computed 
the robust mean (average) and robust standard deviation^ of the ratios 7caic/7 and A^caic/-^ for the 
10^ simulated spectra of each dataset. The results of Powell's method with two free parameters 
(7, A^) using the Xp! X7 statistics are presented in Table || and Figure ^ . The first column, 



^ The primary reference for Powell's minimization method is Powell (1964). More accessible descriptions may be 
found in the numerical-methods literature (e.g., Acton 1970, Gill, Murray & Wright 1981, and Press et al. 1986) 

The robust mean given in all the tables is the mean of all values within two average deviations of the standard 
mean value. The robust standard deviation given in all the tables is 1.55a where a is the standard deviation of all 
values within two average deviations of the standard mean values. 
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N, of Table |T| corresponds to the total theoretical number of counts in the spectrum. The columns 
"7caic/7" and "A^caic/-^^" are the robust mean values of the ratios of the best-fit parameters 
divided by the original value that was used to create the datasets. The parenthetical numbers are 
the robust standard deviations which can be used to determine the significance of the deviation 
from the perfect ratio value of one. For example, the first value of the 2nd column of Table |^ is 
1.002(11) which represents the value of 1.002ib0.011. The deviation of this value from one (i.e. 
0.002) is statistically significant since the error of the mean is only ~0.011/\/ 10^ or ~0. 00011. 

Figure ^ indicates that the new statistic gives the best results. Using a 5% criteria for 
both fitted parameters {'-/,N), we see that the statistic gives good results for spectra with 
^50 photons. By comparison, Pearson's statistic requires ^250 photons and the modified 
Neyman's statistic requires ^750 photons in order to get the same quality of results. Baker & 
Cousins ( 1984| ) noted that, in many cases, fits using the the modified Neyman's statistic 
will underestimate the total number of counts while fits using Pearson's x^ statistic will 
overestimate the total number of counts; both systematic errors are clearly seen in the bottom 
panel of Figure |5[ I stated in the previous section that the usage of the modified Neyman's x^ 
statistic with Poisson-distributed data will typically underestimate the total counts by one count 
per bin. My results for the Xn statistic clearly exhibit this systematic error: the results of the 
ratio Ncsic/N for spectra with N ^ 250 photons (squares in the bottom panel of Fig. ||) are well 
modeled by the function (N — 15) /N where 15 is the number of bins (channels) in our spectra [see 
the dashed curve in the bottom panel of Fig. |5|. 

A comparison of my analysis of 7calc/7 using the modified Neyman's x^ statistic (2nd column 
of Table ||) with the analysis of Nousek & Shue for Powell's method (3rd column of their Table 
3) shows nearly identical results. In my version of this numerical experiment, I used the two 
parameters and 7 while Nousek & Shue used Nq and 7. A comparison of my analysis of Acaic/A^ 
(3rd column of Table |l|) with their Powell's method analysis of Acaic/Ao (2nd column of their 
Table 3) shows that my analysis with Acak/A^ has produced better estimates. This should not be 
surprising because the parameter Aq is not an independent parameter - Aq depends on both the 
slope of the spectrum and the theoretical number of photons in the spectrum. As a general rule, 
one gets better results by solving for independent parameters instead of dependent parameters. 



4.2. Levenberg-Marquardt Method: Solving for 7 and A using Xn? Xp' X 



I determined the best-fit model parameters 7caic and Acaic for each simulated spectrum with 
Levenberg-Marquardt methodQ using the modified Neyman's x^ statistic (xn)i Pearson's x^ 



statistic (xp), and the new xE statistic. I used the previous crude initial guesses: 7 = 0.0 and 



^ The primary references for Levenberg-Marquardt method are Levenberg (1944) and Marquardt ( 1963| ). More 
accessible descriptions may be found in the numerical-methods literature (e.g., Bevington 1969, Gill, Murray & 



Wright 1981, and Press et al. 1986) 
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N = 1.3J2i^'>T'i- I computed the robust mean and robust standard deviation of the ratios 7caic/7 
and Ncaic/N for the 10^ simulated spectra of each dataset. The results of Levenberg-Marquardt 
method with two free parameters {'y,N) using the Xpi X7 statistics are presented in Table ^ 
and Figure P . 



Figure ^ indicates that the new statistic gives the best results. Using a 5% criteria for 



both fitted parameters (7, A'"), we see that the xi statistic gives good results for all the spectra 



(iV ^ 25 photons). By comparison, Pearson's statistic requires ^100 photons and the modified 
Neyman's statistic requires ^500 photons in order to get the same quality of results. 

The results for the Xn statistics are nearly identical with either Powell's method (Table 

H) or the Levenberg-Marquardt method (Table §). This finding refutes the determination by Nousek 



& Shue ( 1989 ) that Powell's method gives more accurate results than the Levenberg-Marquardt 
method. 

The results for Pearson's x^ improved significantly by using the Levenberg-Marquardt 
method instead of Powell's method. An inspection of the individual fits showed that the 
Levenberg-Marquardt method with the Xp statistic produced a best-fit value for that was 
within a one-tenth of one percent of the total number of photons in the spectrum. Needless to 
say, with such an improvement in the determination of A^, a much better estimate for the slope 7 
could be determined. 

This peculiar result tells us something important about this particular minimization problem: 
an excellent estimate of the total number of photons in the best-fit spectrum is the total number 
of photons in the actual spectrum. Thus by setting to be a constant, A^ = X^i^'^j) can 
eliminate one parameter and solve for 7 alone. 



4.3. Powell's Method: Solving for 7 using Xpj X7 

I determined the best-fit model parameter 7caic for each simulated spectrum with Powell's 
function minimization method using the modified Neyman's x^ statistic (xn)) Pearson's x^ 
statistic (xp)) and the new x^ statistic. I set A'' = J2i^ "iT-i and used the crude initial guess of 
7 = 0.0. I computed the robust mean and robust standard deviation of the ratios 7caic/7 for the 
10^ simulated spectra of each dataset. The results of Powell's method with two free parameters 
(7, A^) using the Xnj Xp; X7 statistics are presented in Table ^ and Figure |^ . 

Figure ^ indicates that the new x^ statistic gives the best results. Using a 5% criteria, we 
see that the x^ statistic gives good results for all the spectra (A^ ^ 25 photons). By comparison, 
Pearson's x^ statistic requires ^250 photons and the modified Neyman's x^ statistic requires 
^750 photons in order to get the same quality of results. 

Fitting only for the slope 7 has improved the results for the new x^ statistic and the modified 
Neyman's x^ statistic. The results for Pearson's x^ show no improvement over the two free 
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parameter result. 



4.4. Levenberg-Marquardt Method: Solving for 7 using Xn? Xp' 

I determined the best-fit model parameter 7caic for each simulated spectrum with the 
Levenberg-Marquardt minimization method using the modified Neyman's statistic (xn)) 
Pearson's x^ statistic (xp)> and the new x^ statistic. I set = Y^Y' f^i ^.nd used the crude initial 
guess of 7 = 0.0. I computed the robust mean and robust standard deviation of the ratios 7caic/7 
for the 10^ simulated spectra of each dataset. The results of the Levenberg-Marquardt method 
with one free parameter (7) using the Xnj Xp) X7 statistics are presented in Table Q and Figure ^ 

Figure ^ indicates that Pearson's x^ statistic gives the best results. Using a 5% criteria, 
we see that both the new x^ statistic and the Xp statistic give good results for all the spectra 
[N ^ 25 photons). By comparison, the modified Neyman's x^ statistic still requires ^750 photons 
in order to get the same quality of results. Once again, we note that the results for the X7 and Xn 
statistics are nearly identical with either Powell's method (Table ^) or the Levenberg-Marquardt 
method (Table |). 



4.5. Error Estimates 

One expects the quality of the slope determination to degrade as the total number of photons 
in the X-ray spectra decline. Figure |9| shows the distribution of the best-fit values for the slope 7 
for the faintest spectra with a theoretical total of 100, 50, and 25 photons. As expected, the range 
of best-fit slope values measured for spectra with only N = 25 photons is considerably larger than 
the range of values for spectra with = 100 photons. 

The Levenberg-Marquardt method not only provides best-fit values for parameters but 
it also provides an error estimate (approximately 1 a errors) of those fitted parameters. How 
believable are these error estimates? Figure |l^ shows an analysis of the errors estimated by 
the Levenberg-Marquardt method when the new x^ statistic was used to analyze spectra with 
theoretical totals of 100, 50, and 25 photons. 



The top panel of Figure 10 shows the error analysis of spectra with N = 100 photons. The 
median slope value is 1.989 and the median error estimate is 0.194. A total of 15.87% of the 
spectra have estimates of 7 < 1.789 and 15.87% of the spectra have estimates of 7 > 2.211. 
For a normal distribution, one expects 68.26% of the deviates to be found within one standard 
deviation of the mean. Assuming that the distribution of best-fit 7 values approximates a normal 
distribution, then half of the difference between the 84.13 and 15.87 percentile values of 7 can be 
used as an estimate for the slope error: « (784.13% ~ 7i5.87%)/2 = (2.211 — 1.789)/2 = 0.211 . 
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This value is 8.8% larger than the median Levenberg-Marquardt error estimate; a fractional error 
of 10.6% instead of the predicted 9.8% . 



The middle panel of Figure 10 shows the error analysis of spectra with = 50 photons. The 
median slope value is 2.009 and the median error estimate is 0.301 . A total of 15.87% of the 
spectra have estimates of 7 < 1.732 and 15.87% of the spectra have estimates of 7 > 2.334. This 
gives an estimated slope error of cj^ ^ (784.13% ~ 7i5.87%)/2 = 0.301 . This value is exactly equal 
to the median Levenberg-Marquardt error estimate. 

The bottom panel of Figure |l^ shows the error analysis of spectra with N = 25 photons. 
The median slope value is 2.071 and the median error estimate is 0.484. A total of 15.87% of 
the spectra have estimates of 7 < 1.692 and 15.87% of the spectra have estimates of 7 > 2.570. 
This gives an estimated slope error of (784.13% ~ 7i5.87%)/2 = 0.439 . This value is 9.3% less 
than the median Levenberg-Marquardt error estimate; a fractional error of 21.2% instead of the 
predicted 23.4% . 

The errors estimated by the Levenberg-Marquardt method are seen to be reasonable. Figure 
11 shows the simulated X-ray spectra of Fig. ^ now plotted with Xj fits produced by the 



Levenberg-Marquardt method with one free parmater. The Levenberg-Marquardt method has 
done a good job even with the two faintest spectra which have actual totals of only 28 and 101 
photons. 



4.6. The Xa ^nd Cash's C statistics 



For the sake of completeness, I determined the best-fit model parameter 7caic for each 
simulated spectrum with Powell's function minimization method using the maximum likelihood 
ratio statistic for Poisson distributions, X\ [Equation (^)], and Cash's C statistic, 



C 



N 

2^ [mi 

1=1 

-15 



Ui In (mi)] 



(33) 



[Equation (6) of Cash |l979f| . I set = J2i and used the crude initial guess of 7 = 0.0. 



I computed the robust mean and robust standard deviation of the ratios 7caic/7 for the 10^ 
simulated spectra of each dataset. The results of Powell's method with one free parameter (7) 
using the Xx statistic and Cash's C statistic are presented in Table ^ and Figure 12 . 

Table |5| and the right panel of Figure 12 shows that Cash's C statistic and the maximum 
likelihood ratio statistic for Poisson distributions, S^^^ identical results. This is not surprising 
because Cash's C statistic is a variant of the more well-known Xx statistic which has been 



discussed in the literature for over 70 years (e.g., Neyman & Pearson 1928) 



I also determined the best-fit model parameter 7caic for each simulated spectrum with the 
Levenberg-Marquardt minimization method using the maximum likelihood ratio statistic for 
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Poisson distributions, Xx- I set N = J^j^ f^i ^-iid used the crude initial guess of 7 = 0.0. I computed 
the robust average and robust standard deviation of the ratios 7caic/7 for the 10^ simulated 
spectra of each dataset. The results of the Levenberg-Marquardt method with one free parameter 
(7) using the Xa statistic is presented in Table |6| and Figure ^ . The maximum likelihood 
ratio statistic for Poisson distributions, Xai produces nearly identical results with either Powell's 
method or the Levenberg-Marquardt minimization method. 

Of the two statistics, Xa s-'^d the new x^, which is better? Although Tables ^ and |^ indicate 
that the Xa is slightly better, we see that the actual differences between the distributions presented 
in Figure 12 are really quite negligible when compared with the overall uncertainty caused by 
simple sampling errors (counting statistics) of the simulated X-ray spectra. 



5. SUMMARY 

I have demonstrated that the application of the standard weighted mean formula, 
^iUia^"^ I ) to determine the weighted mean of data, n^, drawn from a Poisson 

distribution, will, on average, underestimate the true mean by ~1 for all true mean values 
larger than ~3 when the common assumption is made that the error of the ith observation 
is (Tj = max(-^/ni, 1). This small, but statistically significant offset, explains the long-known 
observation that chi-square minimization techniques which use the modified Neyman's statistic, 
Xn = ~ max(nj, 1), to compare Poisson-distributed data with model values, in, will 

typically predict a total number of counts that underestimates the true total by about 1 count per 
bin. Based on my finding that the weighted mean of data drawn from a Poisson distribution can 
be determined using the formula + ("-n 1)] + 1) ""^ / + 1) ""^ ) I proposed 

that a new x^ statistic, Xj = J2i + (ni,!) — yif / [rii + 1], should always be used to analyze 
Poisson-distributed data in preference to the modified Neyman's x^ statistic. 

I demonstrated the power and usefulness of x^ minimization by using two statistical fitting 
techniques (Powell's method and the Levenberg-Marquardt method) and five x^ statistics (xn; Xp^ 
X^, Xa' ^'^cl Cash's C) to analyze simulated X-ray power-law 15-channel spectra with large and 
small counts per bin. I showed that Xy minimization with the Levenberg-Marquardt or Powell's 
method can produce excellent results (mean slope errors ^3%) with spectra having as few as 25 
total counts. 

This analysis shows that there is nothing inherently wrong with either the Levenberg- 
Marquardt method or Powell's method in the low-count regime — provided that one uses 
an appropriate x^ statistic for the type of data being analyzed. Given Poisson-distributed 
data, one should always use the new x^ statistic in preference to the modified Neyman's x^ 
statistic because that statistic produces small, but statistically significant, systematic errors with 
Poisson-distributed data. 
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While the new statistic is not perfect, neither is the more well-known Xx statistic (e.g., 
see Figures ^ and |3|). Both statistics have problems in the very-low-count regime. The new 
statistic complements but does not replace the older X\ statistic. Which statistic is "best" will 
generally depend on the particular problem being analyzed. An important difference between 
these two statistics is that the Xx statistic assumes that all data is perfect. With data from perfect 
counting experiments, the statistic may give slightly better results than the new statistic. 
However, data is typically obtained under less-than-perfect circumstances with multiple imperfect 
detectors. The statistic, by definition, is a weighted x^ statistic which makes it easy to assign a 
lower weight to data from poor detectors. Thus in the analysis of real data obtained with noisy 
and imperfect detectors, the x^ statistic may well outperform the classic Xa statistic because 
low-quality data can be given a lower weight instead of being completely rejected. 

Finally, I note in passing that two simple transformations may make it possible to retrofit 
many existing computer implementations (i.e. executable binaries) of Xn minimization algorithms 
to do X7 minimization through the simple expedient of changing the input data from [rij] to 
[ui + min (nj, 1)], and error estimates, cjj, from [max (y^, l)] to [V^-j + IJ • 
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Figure Captions 

Fig. 1. — Analysis of four weighted- mean formulae applied to Poisson-distributed data. Each open 
square represents the weighted mean of 4x10® Poisson deviates at each given true mean value: 
0.001 < /i < 1000. 

(a) The difference between the weighted mean computed using Equation (^), and the 

true mean, /i . The solid curve is the difference between Equation (10) and the true mean: 
|[eM_i][i + Ei(/i)-7-ln(^)]-i}-/.. 



(b) The difference between the weighted mean computed using Equation (|T^, /U^, and the 
true mean, /i . The solid curve is the difference between Equation (15) and the true mean: 
{^[l_e-M]-i_i}_^. 



(c) The difference between the weighted mean computed using Equation (|16D, /djs, and the 
true mean, /i . The solid curve is the difference between Equation ([T7|) and the true mean: 
{^[l_e-M]-i}-^. 

(d) The difference between the weighted mean computed using Equation (|l8|), /x-y, and the true 
mean, n . The solid curve is the difference between Equation (|l^) and the true mean. The difference 
is zero because /^^ is the weighted-mean formula for Poisson-distributed data. 

Fig. 2. — Reduced chi-square {x^/v) as a function of true Poisson mean, /i, for 4 
statistics: Pearson's [Xp = ~ fni)^ /"mi], the modified Neyman's x^ [Xn = Y^f=i{''^i ~ 

rriif' / max(ni, 1)], the new x^ statistic [x^ = Y^f^i{ni+mim{ni, l) — mif'/ (n^ + l)], and the maximum 

likelihood ratio statistic for Poisson distributions = 2Y^f^i {mi — ni + Uiln (ni / rrii)) . The 
Poisson distributions of Figure || were analyzed to produce this plot. The formula for the curve 
connecting the values for modified Neyman's x^ statistic (xn) given in Equation (p7|). The 



formula for the curve connecting the values for new xE statistic is given in Equation (p9|). The 



^7 

dotted line shows the ideal value of one. 

Fig. 3. — The variance of the reduced chi-square as a function of true Poisson mean, for 

5 x^ statistics: the standard x^j Pearson's x^ (Xp); the modified Neyman's x^ (Xn)' X7 
statistic, and the maximum likelihood ratio statistic for Poisson distributions (Xa)- The Poisson 
distributions of Figure ^ were analyzed to produce this plot. The formula for the variance of the 
reduced Pearson's x^ statistic is 2 -|- fi^^. The dotted line shows the ideal value of two. 

Fig. 4. — The dashed lines show 4 ideal X-ray power-law spectra with a total of 25, 100, 1000, and 
10000 photons. Four simulated X-ray spectra with totals of 28, 101, 1015, and 9938 photons are 



shown with la error bars estimated with Equations (9) and (14) of Gehrels (1986). (N.B. Some 
errorbars overlap and the bottom two spectra have identical data values at the 0.47 and 0.67 keV 
bins.) 

Fig. 5. — Results of Powell's method with two free parameters (7, A^) for three statistics: X7 
(circles), Xn (squares), and Xp (triangles). This figure uses the data given in Table |^. The dotted 
lines show the ideal ratio value of one. The dashed curve in the bottom panel shows the function 
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(A^ — 15)/A^ which is a good model for the A'^caic/-^ results of the Xn statistic for all spectra with 
N ^ 250 photons. 

Fig. 6. — Results of the Levenberg-Marquardt method with two free parameters (7, N) for three 
statistics: X7 (circles), Xn (squares), and Xp (triangles). This figure uses the data given in Table 
^. The dotted lines show the ideal ratio value of one. The dashed curve in the bottom panel shows 
the function (A^ — 15) /N which is a good model for the A^caic/-^ results of the Xn statistic for all 
spectra with ^ 250 photons. 

Fig. 7. — Results of Powell's method with one free parameter (7) for three statistics: X7 (circles), 
Xn (squares), and Xp (triangles). This figure uses the data given in Table ^. The dotted lines show 
the ideal ratio value of one. 

Fig. 8. — Results of the Levenberg-Marquardt method with one free parameter (7) for three 
statistics: X7 (circles), Xn (squares), and Xp (triangles). This figure uses the data given in Table 
^. The dotted lines show the ideal ratio value of one. 

Fig. 9. — A comparison of the results of the analysis of the simulated X-ray spectra with theoretical 
totals of 100, 50, and 25 photons using the Levenberg-Marquardt method with 1 free parameter 
(left) and Powell's method with 1 free parameter (right). Note that the histograms for the X7 and 
Xn are nearly identical for both methods. The statistical analysis of this data is presented in Tables 
I and |. 

Fig. 10. — Error analysis of the Levenberg-Marquardt method results using the X7 statistic with 
one free parameter. The thick curve in each panel shows the cumulative distribution of the best-fit 
estimates of the slope 7. The right (left) thin curve in each panel shows the cumulative distribution 
of 7 plus (minus) a-y which is the error estimate of the best-fit slope value. The statistical analysis 
of this data is presented in Table ^ 

Fig. 11. — The simulated X-ray spectra of Fig. ^ now plotted with X7 fits. The best fits are shown 
with solid curves. The upper and lower 1 a slope estimates are shown with long dashed curves. 

Fig. 12. — A comparison of the results of the Levenberg-Marquardt method with 1 free parameter 
(left) and Powell's method with 1 free parameter (right) for three statistics: X7) Xaj Cash's C. The 
statistical analysis of this data is presented in Tables |3|, |5|, and 



- 22 - 



I I lllllll 

t(a) 



1 - 







CO 




CD 




a 




CD 








u 




-1-5 


-1 - 



TTTIII| — I I llllll| — I I llllll| — I I llllll| 

^^ n,/max(n,,l) 



nmiii 



LLUlllL 



1 - 



a. 



l/max(n,,l) 




I 











I 



LLUIUI 





-1 r M> = 



_ Si [n,+min(n,.l)]/[n,+ l]: I 



El l/[n,+ l] 



LLUlllI 



LLlillll 



LLUlllI 



LLUlllI 



LLUlllI 



LLUlllI 



- 1 



I 



_ E. [n,+ l]/[n,+ l ] 
E, l/[n,+ l] 



1 



LLUml 



LLUlllI 



LLUlllI 



LLUml 



LLUlllI 



LLUlllI 



0.01 0.1 1 



10 100 



0.01 0.1 1 



10 100 



Figure 1 of Mighell (1999) 



true mean : /ll 



I I I I llll| 1 I I I llll| 1 I I I llll| 1 I I I llll| 1 I I I llll| 1 I I I nil 




0.001 0.01 0.1 1 10 100 1000 

true mean : ijl 

Figure 2 of Mighell (1999) 



1000 BT-r-TT TTTT1| 1 I I I llll| 1 I I I llll| 1 I I I llll| 1 I I I llll| 1 I I I IIW 



: \ 

F \ y2 
\Ap 

\ 



« 100 b- 



0.001 k: 




mu ™ 



0.001 0.01 



Figure 3 of Mighell (1999) 



0.1 1 10 

true mean : /x 



100 1000 



: 10,000 photons 



1000 - 



: 1,000 photons 



100 - 



: 100 photons 




Figure 4 of Mighell (1999) 



Powell's Method 




10 100 1000 104 

N 



Figure 5 of Mighell (1999) 



Levenberg-Marquardt Method 




10 100 1000 104 

N 



Figure 6 of Mighell (1999) 



Powell's Method 




N 

Figure 7 of Mighell (1999) 



Levenberg-Marquardt Method 




N 

Figure 8 of Mighell (1999) 



Levenberg-Marquardt Method Powell's Method 




Figure 9 of Mighell (1999) 



I 1 I I I I II 




3 

£ 

:3 1 
a 



_l 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 J 


M i 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I 1 1 j.,.j.,a,a.a.,.x.j 




: yn oSilF- 

: ^ 7+o^ 


"oi58? "^"^^ 




U-^ir-^r-r-^^ 


..✓^'1 1237 25 photons - 

1 ■ ■ 1 : 1 1 1 1 1 1 1- 



7 



Figure 10 of Mighell (1999) ~ low resolution 




Figure 11 of Mighell (1999) 



Levenberg-Marquardt Method Powell's Method 




Figure 12 of Mighell (1999) 



Table 1. 

Results of Powell's method with 2 free parameters (7,A^) for 3 statistics: x^, Xp, 



N xl xl xl 





7calc/7 


iVcalc/iV 


7calc/7 


iVcalc/iV 


7calc/7 


iVcalc/iV 


10000 


1.002(11) 





999(12) 





999(11) 


1.000(12) 


1.000(11) 


1.000(12) 


5000 


1.003(15) 





998(17) 





998(15) 


1.001(17) 


1.000(15) 


1.000(17) 


2500 


1.007(23) 





994(24) 





996(22) 


1.002(24) 


0.999(22) 


1.000(24) 


1000 


1.019(37) 





987(39) 





992(34) 


1.007(39) 


0.999(35) 


1.003(39) 


750 


1.025(45) 





981(45) 





989(39) 


1.008(43) 


0.998(41) 


1.003(44) 


500 


1.040(58) 





971(56) 





984(47) 


1.013(54) 


0.996(51) 


1.005(55) 


250 


1.071(82) 





946(79) 





969(67) 


1.025(77) 


0.992(80) 


1.009(81) 


150 


1.09(10) 





92(10) 





952(84) 


1.04(10) 


0.99(10) 


1.01(11) 


100 


1.10(13) 





91(13) 





93(10) 


1.06(12) 


0.99(13) 


1.02(13) 


75 


1.11(15) 





89(14) 





92(12) 


1.07(14) 


0.99(15) 


1.03(15) 


50 


1.11(20) 





87(17) 





89(14) 


1.10(18) 


0.99(19) 


1.04(19) 


25 


1.17(50) 





84(24) 





82(19) 


1.18(26) 


1.05(40) 


1.07(28) 
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Table 2. 

Results of the Levenberg-Marquardt method with 2 free parameters (7,A^) for 3 statistics: x^, Xp, 



N xl xl xl 





7calc/7 


iVcalc/iV 


7calc/7 


iVcalc/iV 


7calc/7 


iVcalc/iV 


10000 


1.002(11) 





999(12) 


1 


000(11) 


1.000(12) 


1.000(12) 


1.000(11) 


5000 


1.004(15) 





998(17) 


1 


000(15) 


1.000(17) 


1.000(15) 


1.000(17) 


2500 


1.007(23) 





994(24) 





997(22) 


1.000(24) 


0.999(22) 


1.000(24) 


1000 


1.019(37) 





987(39) 





995(34) 


1.000(38) 


0.999(35) 


1.003(39) 


750 


1.025(45) 





981(45) 





995(38) 


1.000(43) 


0.998(41) 


1.003(44) 


500 


1.040(58) 





971(56) 





995(46) 


1.000(54) 


0.996(51) 


1.005(55) 


250 


1.071(82) 





946(79) 





994(67) 


1.000(76) 


0.992(80) 


1.009(81) 


150 


1.09(10) 





92(10) 





986(91) 


0.994(97) 


0.99(10) 


1.01(11) 


100 


1.10(13) 





91(13) 





96(12) 


1.00(12) 


0.99(13) 


1.02(13) 


75 


1.11(15) 





89(14) 





93(13) 


1.00(14) 


0.99(15) 


1.03(15) 


50 


1.11(20) 





87(17) 





90(14) 


0.99(17) 


0.99(19) 


1.04(19) 


25 


1.12(35) 





84(24) 





88(17) 


0.99(23) 


1.01(29) 


1.07(28) 
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Table 3. 

Results of Powell's method with 1 free parameter (7) for 3 statistics: x^, Xp, 



N xl Xp Xl 



7calc/7 7calc/7 7calc/7 

10000 1.002(11) 0.999(11) 1.000(11) 

5000 1.003(15) 0.998(15) 1.000(15) 

2500 1.007(23) 0.996(22) 0.999(22) 

1000 1.019(37) 0.992(34) 0.999(35) 

750 1.025(45) 0.989(39) 0.998(41) 

500 1.040(58) 0.984(47) 0.996(51) 

250 1.070(82) 0.969(67) 0.992(80) 

150 1.09(11) 0.952(84) 0.99(10) 

100 1.10(13) 0.93(10) 0.99(13) 

75 1.11(15) 0.92(12) 1.00(15) 

50 1.10(19) 0.89(14) 1.00(18) 

25 1.06(31) 0.82(19) 1.03(27) 
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Table 4. 

Results of the Levenberg-Marquardt method with 1 free parameter (7) for 3 statistics: x^, Xp, 



N xl Xp Xl 



7calc/7 7calc/7 7calc/7 

10000 1.002(11) 1.000(11) 1.000(11) 

5000 1.004(15) 1.000(15) 1.000(15) 

2500 1.007(23) 0.999(22) 0.999(22) 

1000 1.019(37) 0.994(34) 0.999(35) 

750 1.025(45) 0.992(39) 0.998(41) 

500 1.040(58) 0.990(48) 0.996(51) 

250 1.070(82) 0.988(68) 0.992(80) 

150 1.09(10) 0.988(87) 0.99(10) 

100 1.10(13) 0.99(11) 0.99(13) 

75 1.10(15) 0.99(12) 1.00(15) 

50 1.10(19) 0.99(16) 1.00(18) 

25 1.06(31) 0.99(22) 1.03(27) 
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Table 5. 

Results of Powell's method with 1 free parameter (7) for 2 statistics: x^, Cash's C 



N xl Cash's C 



7calc/7 7calc/7 

10000 1.000(11) 1.000(11) 

5000 1.000(15) 1.000(15) 

2500 1.000(22) 1.000(22) 

1000 1.000(34) 1.000(34) 

750 1.000(39) 1.000(39) 

500 1.000(48) 1.000(48) 

250 0.999(68) 0.999(68) 

150 0.999(88) 0.999(88) 

100 1.00(11) 1.00(11) 

75 1.00(12) 1.00(12) 

50 1.00(16) 1.00(16) 

25 1.00(22) 1.00(22) 
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Table 6. 

Results of the Levenberg-Marquardt method with 1 free parameter (7) for the x'x statistic 



N 


xl 




7calc/7 


10000 


1.000(11) 


5000 


1.000(15) 


2500 


1.000(22) 


1000 


1.000(34) 


750 


1.000(39) 


500 


1.000(48) 


250 


0.999(68) 


150 


0.999(88) 


100 


1.00(11) 


75 


1.00(12) 


50 


1.00(16) 


25 


1.00(22) 
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